Prompt
- Glossary
- LLM Settings
- Structure
- Techniques
- Context Engineering
- Development
Glossary
- Prompt Engineering: process of carefully designing and optimizing instructions (prompts) to elicit the best possible output from generative AI models, especially Large Language Models (LLMs). By providing clear, specific, and well-structured prompts, you can guide the AI to generate relevant, accurate, and high-quality responses
- Prompt: input you provide to a generative AI model to request a specific output. It can be a simple question, a set of instructions, or even a creative writing example
- Large Language Model (LLM): AI model designed to understand and generate human-like text. LLMs are trained on vast amounts of data and can perform tasks like translation, summarization, and even creative writing
- Prompt Template: a pre-defined structure or format for a prompt that can be customized with specific details or variables to generate dynamic prompts (a minimal sketch follows this glossary)
- Prompt Tuning: process of fine-tuning pre-trained LLMs by adapting them to specific tasks or domains through prompt engineering, rather than traditional fine-tuning methods
- Prompt Injection: a security vulnerability where an attacker manipulates the input prompt to influence the AI model's behavior in unintended ways, potentially leading to unauthorized actions or disclosures
- Prompt Leakage: situation where sensitive information from the prompt is inadvertently included in the generated output, posing privacy or security risks
- Prompt Bias: tendency of an AI model to generate responses that reflect the biases present in its training data, leading to unfair or inaccurate outcomes
- Prompt Hallucination: when an AI model generates information that is not supported by the input prompt or its training data, leading to false or misleading outputs
- Prompt Testing: process of evaluating and validating prompts to ensure they produce the desired output, meet quality standards, and comply with ethical and regulatory requirements
- Prompt Optimization: continuous process of refining prompts to improve their performance, based on feedback, testing results, and changes in the AI model or its training data
- Context Window: max number of tokens the model can process at once, including input and output. Often a model-specific architectural limit
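A minimal sketch of the prompt template idea, using only the Python standard library; the template text and variable names are illustrative, not from any specific framework:

```python
# Minimal prompt-template sketch using only the standard library; the
# template text and variables are illustrative, not from any framework.
from string import Template

SUMMARIZE = Template(
    "You are a $role. Summarize the following text in $num_points bullet points:\n\n$text"
)

prompt = SUMMARIZE.substitute(
    role="technical editor",
    num_points=3,
    text="Large Language Models are trained on vast amounts of data ..",
)
print(prompt)
```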
LLM Settings
Category | Setting Parameter | Description | Low Value Use Cases | High Value Use Cases |
---|---|---|---|---|
Sampling | Temperature | controls the randomness or "creativity" of the output. Higher values lead to more diverse and imaginative responses, while lower values make the output more deterministic and focused | factual Q&A, summarization | story generation, poetry, brainstorming |
Sampling | Top-P (Nucleus Sampling) | selects tokens from the smallest possible set whose cumulative probability exceeds the top_p threshold. Works in conjunction with temperature to control diversity | precise answers | varied and imaginative text |
Sampling | Top-K Sampling | limits the token selection to the top k most probable tokens at each step. The model will only consider words within this set. Often used in conjunction with Top-P | a low k limits token selection to few options for more focused output | a high k expands token options for greater diversity and creativity, but may include less relevant choices |
Advanced Sampling | Logit Bias | allows you to modify the probability of specific tokens appearing or not appearing in the generated output. You can increase or decrease the likelihood of certain words | reduces the likelihood of tokens with negative bias, prompting the model to avoid specific words | increases the likelihood of tokens with positive bias, encouraging the model to include specific words or phrases |
Output Control | Max Length / Max Tokens | sets the maximum number of tokens the model will generate in its response. In some APIs this limit covers both the input prompt and the generated output | summarization, quick answers: concise, cost-effective responses, cutting off if necessary | essay generation, code generation, detailed explanations: more detailed responses, but manage to avoid irrelevance and high costs |
Output Control | Stop Sequences | string or list of strings that, when encountered in the generated output, stops the model from generating further tokens | stops generating text at specified sequences, ensuring structured outputs and preventing run-ons | continues generating until reaching max tokens or an end-of-text token |
Output Control | N (Number of Completions) | specifies how many independent completions (responses) the model should generate for a single prompt | produces one response, typical for direct answers | creates several distinct responses for selection or variation, potentially increasing cost |
Repetition Control | Frequency Penalty | applies a penalty to new tokens based on how many times that token has already appeared in the text (prompt + generated response) | allows repetition with less penalty, increasing the likelihood of repeated words or phrases | imposes a higher penalty on repetition, promoting new vocabulary and discouraging repeated tokens |
Repetition Control | Presence Penalty | imposes a uniform penalty on new tokens that have appeared in the text at least once, regardless of their frequency | reduces penalties on previously mentioned tokens to maintain focus on a specific topic | increases penalties on previously used tokens to encourage diverse and distinct ideas |
Reproducibility | Seed | setting a seed makes the model's output deterministic for a given set of parameters | a fixed seed guarantees consistent results for repeated calls with the same prompt and settings, aiding debugging and reproducibility | without a seed, each call with the same prompt and settings yields a different output, while still adhering to other parameters |
Input Processing | Context Window (Max Context Length) | maximum number of tokens (input prompt + generated output) that the model can process and consider at one time. This is often a model-specific architectural limit | short prompts limit the model's memory of prior conversation, causing context loss in longer interactions | long conversations and large document analysis allow the model to maintain context, enhancing coherence and relevance in extended interactions |
Model Selection | Model Name/ID | specifies the particular LLM variant or version to be used. Different models have varying capabilities, sizes, and training data | smaller models may produce lower quality, less nuanced responses and have limited capabilities | larger models generally provide higher quality, more nuanced responses, but may incur higher costs and slower inference |
Generation Strategy | Decoding Type | refers to the algorithm used to select the next token. Common types include greedy decoding, beam search, and sampling (which involves temperature, top-p, top-k) | "greedy" selection yields deterministic but potentially less creative output by always choosing the highest-probability token | "sampling" adds variability, while "beam search" explores multiple sequences to identify more globally optimal output |
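A sketch of how these settings map onto an OpenAI-style chat completion call; parameter names and availability vary by provider, and all values here are illustrative:

```python
# Sketch: mapping the settings above onto an OpenAI-style chat completion call.
# Parameter names and availability vary by provider (e.g., top_k is offered by
# Anthropic's API but not OpenAI's); all values here are illustrative.
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

response = client.chat.completions.create(
    model="gpt-4o-mini",      # Model Name/ID
    messages=[{"role": "user", "content": "Summarize the plot of Hamlet."}],
    temperature=0.2,          # Sampling: low for focused, factual output
    top_p=0.9,                # Nucleus sampling threshold
    max_tokens=256,           # Output Control: cap on generated tokens
    stop=["\n\n"],            # Stop Sequences
    n=1,                      # Number of Completions
    frequency_penalty=0.5,    # Repetition Control: discourage repeated tokens
    presence_penalty=0.0,     # Repetition Control: uniform penalty per token
    seed=42,                  # Reproducibility (best-effort determinism)
)
print(response.choices[0].message.content)
```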
Structure
Aspect | Definition | Example |
---|---|---|
Task Context | briefly describe the overall task or objective to provide context for the model | You will be acting as an AI career coach named Joe created by the company AdAstra Careers. Your goal is to give career advice to users. You will be replying to users who are on the AdAstra site and who will be confused if you don't respond in the character of Joe |
Tone Context | specify the desired tone or style (e.g., formal, casual, technical, humorous) | You should maintain a friendly customer service tone |
Background data, documents, and images | provide any relevant background information, documents, or images that can help the model understand the context | Here is the career guidance document you should reference when answering the user: <guide>{{DOCUMENT}}</guide> |
Detailed task description & rules | outline the specific requirements, constraints, and rules for the task | Here are some important rules for the interaction: .. |
Examples | include examples that illustrate the desired output or behavior | Here is an example of how to respond in a standard interaction: .. |
Conversation history | provide context from previous interactions that may be relevant to the current task | Here is the conversation history (between the user and you) prior to the question. It could be empty if there is no history: <history> {{HISTORY}} </history> Here is the user's question: <question> {{QUESTION}} </question> |
Immediate task description or request | clearly state the task or question at hand | How do you respond to the user's question? |
Thinking step by step / take a deep breath | encourage a methodical approach to problem-solving | Think about your answer first before you respond |
Output formatting | specify any required formatting for the response | Put your response in <response></response> tags |
Prefilled response (if any) | begin the assistant's reply yourself so the model continues from it, steering format and voice | <response> |
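A sketch assembling the components above into a single prompt string; build_prompt is a hypothetical helper, and the section texts echo the table's examples:

```python
# Sketch: assembling the structural components above into one prompt string.
# build_prompt is a hypothetical helper; the section texts echo the table's examples.
def build_prompt(document: str, history: str, question: str) -> str:
    return "\n\n".join([
        # Task context + tone context
        "You will be acting as an AI career coach named Joe created by the "
        "company AdAstra Careers. Maintain a friendly customer service tone.",
        # Background data, documents, and images
        f"Here is the career guidance document you should reference:\n<guide>{document}</guide>",
        # Conversation history
        f"Here is the conversation history (may be empty):\n<history>{history}</history>",
        # Immediate task description or request
        f"Here is the user's question:\n<question>{question}</question>",
        # Step-by-step nudge + output formatting
        "Think about your answer first before you respond. "
        "Put your response in <response></response> tags.",
    ])

print(build_prompt(document="{{DOCUMENT}}", history="", question="How do I switch careers?"))
```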
Techniques
Technique | Definition | Explanation | Example |
---|---|---|---|
Zero-Shot Prompting | Perform tasks without examples | AI leverages pre-trained knowledge to handle new tasks without specific examples | Classify this text as positive or negative sentiment |
One-Shot Prompting | Learn from single example | Provide one example to guide AI's understanding of the desired task format | Translate 'hello' to French: 'bonjour' |
Few-Shot Learning | Learn from multiple examples | Supply 2-5 examples to demonstrate task patterns and expected outputs | 1+1=2, 2+2=4, 3+3=6. What is 4+4? |
Chain of Thought (CoT) | Step-by-step reasoning | Guide AI to break down complex problems into logical intermediate steps | If I have 3 apples and buy 2 more, then give away 1, how many do I have left? Let's think step by step.. |
Zero-Shot Chain of Thought (CoT) | CoT without examples | Use simple instruction like "Let's think step by step" to trigger reasoning | Calculate 15*7. Let's think step by step: |
Multimodal Chain of Thought (CoT) | CoT with multiple data types | Combine text, images, and other modalities in reasoning process | Analyze this image and describe the step-by-step process shown |
Auto Chain of Thought (CoT) | Automated CoT generation | Automatically generate reasoning chains through clustering and pattern recognition | AI generates its own step-by-step reasoning paths |
Constrained Generation | Limit output format | Restrict AI responses to specific formats, lengths, or structures | List exactly 5 items in bullet points, no more than 10 words each |
Contextual Prompts (RAG) | Use external context | Incorporate relevant external information to ground responses | Based on the provided company policy document, answer.. |
Effectiveness Evaluation | Measure prompt quality | Assess how well prompts achieve desired outcomes using specific metrics | Compare response quality across different prompt variations |
Ethical Considerations | Ensure responsible AI use | Design prompts that avoid bias, misinformation, and harmful content | Include fairness constraints and content safety guidelines |
Handling Ambiguity | Clarify unclear requests | Add specific constraints and context to reduce interpretation ambiguity | Summarize in exactly 3 bullet points under 50 words total |
Instruction Engineering | Craft clear directives | Write precise, unambiguous instructions for desired AI behavior | You are a technical writer. Explain quantum computing in simple terms |
Length Management | Control response size | Specify exact length requirements or use tokens to manage output | Provide a response between 100-200 words |
Meta-Prompting | Use AI to improve prompts | Employ one AI model to generate or optimize prompts for another | Improve this prompt to get better results: [original prompt] |
Multilingual Prompting | Handle multiple languages | Specify target language and cultural context for responses | Respond in Spanish using formal tone and Mexican cultural references |
Negative Prompting | Specify what to avoid | Explicitly state what the AI should not include in responses | Explain quantum physics without using mathematics or formulas |
Prompt Chaining | Link multiple prompts | Connect outputs of one prompt as inputs to subsequent prompts | Step 1: Analyze requirements. Step 2: Generate code based on analysis |
Prompt Formatting | Structure prompt layout | Use markdown, sections, and clear organization for better parsing | Format response as: ## Summary\n## Key Points\n## Conclusion |
Prompt Optimization | Refine prompts iteratively | Systematically improve prompts through testing and feedback loops | A/B test different prompt versions and measure performance |
Prompt Security | Prevent injection attacks | Design prompts resistant to malicious input manipulation | Validate and sanitize user inputs before processing |
Prompt Templates | Reusable prompt structures | Create parameterized templates for consistent, repeatable prompting | Generate [type] about [topic] for [audience] in [style] |
ReAct (Reason + Act) | Combine reasoning and actions | Alternate between thinking through problems and taking actions | Thought: I need to search for information. Action: Search [query] |
Rephrase and Respond (RaR) | Clarify before answering | Ask AI to rephrase questions for better understanding before responding | First rephrase this question, then provide your answer |
Role Prompting | Assign specific personas | Instruct AI to respond as particular characters or professionals | You are a senior software architect with 20 years of experience |
Style Prompting | Control output style | Guide the AI in adopting specific tones, formats, or structures | Respond in a formal tone, using bullet points for clarity |
Explicit Instructions Prompting | Define clear, direct instructions | Provide explicit, unambiguous instructions for the AI to follow | "Summarize the following text in exactly three bullet points." |
Output Priming | Set expectations for output | Guide the AI on the desired format, style, or content of its responses | Respond with a summary in bullet points, no more than 10 words each |
Self-Consistency | Generate multiple solutions | Create several reasoning paths and select most consistent answer | Generate 5 different solutions and choose the most frequent |
Self-Critique & Refinement | AI evaluates own output | Have AI review and improve its own responses iteratively | Review your answer and identify any weaknesses or improvements |
Step-Back Prompting | Consider broader context | First ask about general principles before specific applications | What are the general principles of good UX design? Now apply them to.. |
System Prompting | Set behavioral guidelines | Define overarching rules and context for all interactions | You are a helpful assistant that always responds truthfully and safely |
Task Decomposition | Break complex tasks down | Divide large problems into smaller, manageable sub-tasks | Break this project into 5 specific, actionable steps |
Task-Specific Prompts | Tailor to particular tasks | Customize prompts for specific use cases or domains | Write a product description for an e-commerce website |
Tree of Thoughts (ToT) | Explore multiple reasoning branches | Create tree structure of thoughts with evaluation and backtracking | Explore 3 different approaches to solve this problem, evaluate each |
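A sketch of Self-Consistency from the table above; ask_model is a stand-in for any LLM call, and extract_answer assumes the model was told to end with "Answer: <value>":

```python
# Sketch of Self-Consistency: sample several reasoning paths at a higher
# temperature, then take a majority vote over the final answers.
import random
from collections import Counter

def ask_model(prompt: str, temperature: float = 0.8) -> str:
    # Stand-in for an LLM call; replace with your provider's API.
    return "step-by-step reasoning .. Answer: " + random.choice(["8", "8", "7"])

def extract_answer(completion: str) -> str:
    # Assumes the model was instructed to end with "Answer: <value>".
    return completion.rsplit("Answer:", 1)[-1].strip()

def self_consistent_answer(question: str, samples: int = 5) -> str:
    prompt = f"{question}\nThink step by step, then end with 'Answer: <value>'."
    answers = [extract_answer(ask_model(prompt)) for _ in range(samples)]
    return Counter(answers).most_common(1)[0][0]  # most frequent answer wins

print(self_consistent_answer("What is 4+4?"))
```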
Context Engineering
Aspect | Prompt Engineering | Context Engineering |
---|---|---|
Definition | the process of designing and optimizing prompts to produce desired responses from AI models. It is a subset of Context Engineering | the practice of structuring and managing the information provided to AI models to enhance their understanding and performance on specific tasks |
Focus | focuses on what to say to the model at a moment in time | focuses on what the model knows when you say it - and why it should care |
Purpose | get a specific response from a prompt; usually one-off | make sure the model consistently performs well across sessions and tasks |
Mindset | crafting clear instructions | designing the entire flow and architecture of a model's thought process |
Scope | operates within a single input-output pair | handles everything the model sees - memory, history, tools, system prompts |
Repeatability | can be hit-or-miss and often needs manual tweaks | designed for consistency and reuse across many users and tasks |
Scalability | starts to fall apart when scaled - more users = more edge cases | built with scale in mind from the beginning |
Precision | relies heavily on wordsmithing to get things "just right" | focuses on delivering the right inputs at the right time, reducing the burden on the prompt itself |
Tools | prompt box | memory modules, RAG systems, API chaining, and more backend coordination |
Debugging | mostly rewording and guessing what went wrong | involves inspecting the full context window, memory slots, and token flow |
Use Cases | Copywriting variations; one-shot code generation | LLM Agents with memory; Customer support bots; Multi-turn flows |
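A sketch of the context-engineering side: deciding what the model sees before any prompt is sent; retrieve() and the character budget are illustrative stand-ins:

```python
# Sketch of context engineering: decide what the model sees before any
# prompt is sent. retrieve() and the character budget are stand-ins.
def retrieve(query: str, k: int = 3) -> list[str]:
    # Stand-in for a real RAG retriever (vector search, keyword index, etc.).
    return [f"[doc {i} relevant to: {query}]" for i in range(k)]

def build_context(system_prompt: str, history: list[str], query: str,
                  budget_chars: int = 8000) -> str:
    docs = retrieve(query)
    # Reserve budget for the fixed parts, then keep the most recent history
    # turns that still fit (real systems budget in tokens, not characters).
    remaining = budget_chars - len(system_prompt) - sum(map(len, docs)) - len(query)
    kept: list[str] = []
    for turn in reversed(history):
        if len(turn) > remaining:
            break
        kept.insert(0, turn)
        remaining -= len(turn)
    return "\n\n".join([system_prompt, *docs, *kept, query])

print(build_context("You are a support bot.", ["user: hi", "bot: hello"], "Where is my order?"))
```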
Spec-Driven Development
- Spec-Driven Development: an emerging methodology that treats detailed, structured specifications as living, executable documents describing the software's what and why, while AI-powered tools handle the how by translating intent into code, overcoming the limits of ad-hoc AI-assisted coding. Inspired by TDD and BDD, it adapts them for AI workflows, making specs the source of truth for implementation, validation, and iteration.
Key Principles
- Specifications as Living Artifacts: Specs are dynamic, version-controlled documents (e.g., Markdown) that evolve, serving as the "North Star" for AI agents and teams
- Separation of Intent and Implementation: Focus on "what" (user needs, outcomes) in specs and "how" (architecture, stack) in plans
- Clarity and Unambiguity: Use precise language, consistent terminology, and structures (e.g., structs, loops in plain English) to minimize AI misinterpretation
- Iterative Refinement with Checkpoints: Validate outputs at each phase; developers critique and refine to spot gaps
- Incorporation of Constraints Early: Bake in security, compliance, performance, and integrations from the start
- Human Oversight: AI handles execution, but humans steer, review, and evaluate for reasonableness
- Raising Abstraction Levels: Shift from imperative ("how") to declarative ("what") programming, echoing historical leaps like high-level languages
Workflow
Phase | Definition | Key Activities | Focus | Challenges | Outputs |
---|---|---|---|---|---|
Specify | capture high-level intent focusing on user needs and outcomes. Avoid technical details | provide prompts on "what" and "why"; AI generates detailed spec; developer reviews and refines | what/why | ambiguity, scope creep, vague user needs | living spec document (e.g., Markdown with user stories, journeys, criteria) |
Plan | add technical "how" elements like stack, architecture, constraints | share docs on standards, integrations; AI generates plans (possibly variations); review for alignment | how (high-level) | overlooking constraints | technical plan document, including alternatives and decisions |
Tasks | break down into small, isolated, actionable steps | AI decomposes spec and plan; tasks mimic TDD for AI, ensuring testability | breakdown | task granularity | task list (e.g., Markdown checklist) |
Implement | execute tasks with AI generating code | AI implements per task; run tests, linting; developer reviews changes incrementally | execution | context loss in AI | code, tests, validated builds |
Review & Iterate | verify overall alignment; update spec for changes | test app; lint spec for clarity; regenerate as needed | validation | misalignment with original intent | refined artifacts; deployed features |
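A toy sketch of the Tasks phase output: parsing a Markdown checklist into discrete steps an agent could implement one at a time; the "- [ ]" convention and the checklist content are assumptions, not a standard:

```python
# Toy sketch of the Tasks phase: parse a Markdown checklist into discrete,
# actionable steps an AI agent could implement one at a time.
# The "- [ ]" convention and the checklist content are assumptions.
import re

def load_tasks(markdown: str) -> list[dict]:
    tasks = []
    for line in markdown.splitlines():
        match = re.match(r"- \[( |x)\] (.+)", line.strip())
        if match:
            tasks.append({"done": match.group(1) == "x", "step": match.group(2)})
    return tasks

checklist = """\
- [x] Specify: draft user stories for the signup flow
- [ ] Plan: choose auth provider and data model
- [ ] Implement: generate signup endpoint with tests
"""
for task in load_tasks(checklist):
    print("DONE" if task["done"] else "TODO", task["step"])
```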